Clustering via Concave Minimization

Authors

  • Paul S. Bradley
  • Olvi L. Mangasarian
  • William Nick Street
Abstract

Author address: W. N. Street, Computer Science Department, Oklahoma State University, 205 Mathematical Sciences, Stillwater, OK 74078, email: nstreet@es.okstate.edu

The problem of assigning m points in the n-dimensional real space R^n to k clusters is formulated as that of determining k centers in R^n such that the sum of distances of each point to the nearest center is minimized. If a polyhedral distance is used, the problem can be formulated as that of minimizing a piecewise-linear concave function on a polyhedral set, which is shown to be equivalent to a bilinear program: minimizing a bilinear function on a polyhedral set. A fast finite k-Median Algorithm, consisting of solving a few linear programs in closed form, leads to a stationary point of the bilinear program. Computational testing was carried out on a number of real-world databases. On the Wisconsin Diagnostic Breast Cancer (WDBC) database, k-Median training set correctness was comparable to that of the k-Mean Algorithm; however, its testing set correctness was better. Additionally, on the Wisconsin Prognostic Breast Cancer (WPBC) database, distinct and clinically important survival curves were extracted by the k-Median Algorithm, whereas the k-Mean Algorithm failed to obtain such distinct survival curves for the same database.
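The alternating scheme described in the abstract can be illustrated with a minimal sketch. It assumes the polyhedral distance is the 1-norm, in which case the best center for a fixed cluster is its coordinate-wise median (a closed-form solution). The function names `k_median` and `one_norm_objective`, the `max_iter` parameter, and the choice to leave the center of an empty cluster unchanged are illustrative assumptions, not details taken from the paper.

```python
import numpy as np


def one_norm_objective(points, centers):
    """Sum over points of the 1-norm distance to the nearest center."""
    points = np.asarray(points, dtype=float)
    centers = np.asarray(centers, dtype=float)
    dist = np.abs(points[:, None, :] - centers[None, :, :]).sum(axis=2)
    return float(dist.min(axis=1).sum())


def k_median(points, initial_centers, max_iter=100):
    """Alternate nearest-center assignment (1-norm) and median updates.

    Hypothetical helper for illustration; stops when assignments repeat.
    """
    points = np.asarray(points, dtype=float)
    centers = np.asarray(initial_centers, dtype=float).copy()
    labels = None
    for _ in range(max_iter):
        # 1-norm distance from every point to every center, shape (m, k).
        dist = np.abs(points[:, None, :] - centers[None, :, :]).sum(axis=2)
        new_labels = dist.argmin(axis=1)
        if labels is not None and np.array_equal(new_labels, labels):
            break  # assignments unchanged, so the centers are unchanged too
        labels = new_labels
        for j in range(len(centers)):
            members = points[labels == j]
            if len(members) > 0:  # assumption: keep the old center if empty
                centers[j] = np.median(members, axis=0)
    return centers, labels


# Two well-separated groups of three points each in R^2.
data = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
centers, labels = k_median(data, initial_centers=[[0, 0], [10, 10]])
total = one_norm_objective(data, centers)
```

Neither step can increase the 1-norm objective, which is why a loop of this kind terminates after finitely many iterations; the point it reaches depends on the initial centers.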

Similar resources

On the Minimization of Concave Information Functionals for Unsupervised Classification via Decision Trees

A popular method for unsupervised classification of high-dimensional data via decision trees is characterized as minimizing the empirical estimate of a concave information functional. It is shown that minimization of such functionals under the true distributions leads to perfect classification.

Machine Learning via Polyhedral Concave Minimization

Two fundamental problems of machine learning, misclassification minimization and feature selection, are formulated as the minimization of a concave function on a polyhedral set. Other formulations of these problems utilize linear programs with equilibrium constraints, which are generally intractable. In contrast, for the proposed concave minimization formulation, a successive linearization algorithm wi...

Semi-Supervised Support Vector Machines for Unlabeled Data Classification

A concave minimization approach is proposed for classifying unlabeled data based on the following ideas: (i) A small representative percentage (5% to 10%) of the unlabeled data is chosen by a clustering algorithm and given to an expert or oracle to label. (ii) A linear support vector machine is trained using the small labeled sample while simultaneously assigning the remaining bulk of the unlab...

Feature Selection via Concave Minimization and Support Vector Machines

Computational comparison is made between two feature selection approaches for finding a separating plane that discriminates between two point sets in an n-dimensional feature space and that utilizes as few of the n features (dimensions) as possible. In the concave minimization approach [19, 5] a separating plane is generated by minimizing a weighted sum of distances of misclassified points to two para...

Absolute value equation solution via concave minimization

The NP-hard absolute value equation (AVE) Ax − |x| = b, where A ∈ R^{n×n} and b ∈ R^n, is solved by a succession of linear programs. The linear programs arise from a reformulation of the AVE as the minimization of a piecewise-linear concave function on a polyhedral set and solving the latter by successive linearization. A simple MATLAB implementation of the successive linearization algorithm solved 100 ...

Publication date: 1996